Character-based PSMT for Closely Related Languages

نویسنده

  • Jörg Tiedemann
چکیده

Translating unknown words between related languages using a character-based statistical machine translation model can be beneficial. In this paper, we describe a simple method to combine character-based models with standard word-based models to increase the coverage of a phrase-based SMT system. Using this approach, we can show a modest improvement when translating between Norwegian and Swedish. The potentials of applying character-based models to closely related languages is also illustrated by applying the character model on its own. The performance of such an approach is similar to the word-level baseline and closer to the reference in terms of string similarity.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Combining Word-Level and Character-Level Models for Machine Translation Between Closely-Related Languages

We propose several techniques for improving statistical machine translation between closely-related languages with scarce resources. We use character-level translation trained on n-gram-character-aligned bitexts and tuned using word-level BLEU, which we further augment with character-based transliteration at the word level and combine with a word-level translation model. The evaluation on Maced...

متن کامل

Linguistically Motivated Reordering Modeling for Phrase-Based Statistical Machine Translation

Word reordering is one of the most difficult aspects of Statistical Machine Translation (SMT), and an important factor of its quality and efficiency. While short and mediumrange reordering is reasonably handled by the phrase-based approach (PSMT), long-range reordering still represents a challenge for state-of-the-art PSMT systems. As a major cause of this problem, we point out the inadequacy o...

متن کامل

Character-Based Pivot Translation for Under-Resourced Languages and Domains

In this paper we investigate the use of character-level translation models to support the translation from and to underresourced languages and textual domains via closely related pivot languages. Our experiments show that these low-level models can be successful even with tiny amounts of training data. We test the approach on movie subtitles for three language pairs and legal texts for another ...

متن کامل

On The Communication Complexity of Perfectly Secure Message Transmission in Directed Networks

In this paper, we re-visit the problem of perfectly secure message transmission (PSMT) in a directed network under the presence of a threshold adaptive Byzantine adversary, having unbounded computing power. Desmedt et.al [5] have given the characterization for three or more phase PSMT protocols over directed networks. Recently, Patra et. al. [15] have given the characterization of two phase PSM...

متن کامل

Efficient Perfectly Reliable and Secure Communication Tolerating Mobile Adversary

We study the problem of Perfectly Reliable Message Transmission (PRMT) and Perfectly Secure Message Transmission (PSMT) between two nodes S and R in an undirected synchronous network, a part of which is under the influence of an all powerful mobile Byzantine adversary. In ACISP’2007 Srinathan et. al. has proved that the connectivity requirement for PSMT protocols is same for both static and mob...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009